Introduction

In this assignment, I would like to examine the correlation of diabetes and obesity with physical inactivity in the US in 2017. To do so, I downloaded two 2017 datasets from the CDC: “Diagnosed diabetes among adults aged >=18 years” and “Obesity among adults aged >=18 years”. They include estimates for the 500 largest US cities and approximately 28,000 census tracts within these cities.


Methods

Read in the data by API

I used the CDC's API to obtain my datasets. First, create an account with a password. Then, apply for a free app token. Finally, copy the API endpoint of each dataset. Both datasets contain 27 columns and 29,006 rows.
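The request step can be sketched in Python as follows. This is a minimal sketch, not the code used for this assignment: it assumes the endpoint is a Socrata-style (SODA) JSON endpoint that accepts `$limit` and `$offset` query parameters and an `X-App-Token` header. The function names are hypothetical, and the endpoint URL and token are placeholders for the values copied from your own account.

```python
import json
import urllib.parse
import urllib.request


def build_soda_url(endpoint, limit=50000, offset=0):
    """Append paging parameters to an API endpoint URL."""
    # safe="$" keeps the leading "$" of the parameter names readable.
    query = urllib.parse.urlencode({"$limit": limit, "$offset": offset}, safe="$")
    return f"{endpoint}?{query}"


def fetch_rows(endpoint, app_token, limit=50000, offset=0):
    """Download one page of rows as a list of dicts (performs a network call)."""
    request = urllib.request.Request(
        build_soda_url(endpoint, limit, offset),
        headers={"X-App-Token": app_token},  # assumed header for the app token
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical usage: paste your own endpoint and token here.
# rows = fetch_rows("<your API endpoint>", "<your app token>")
```

Setting `limit` above the row count (29,006 here) should return a dataset in a single page, if the server allows a page that large; otherwise, repeat the call with an increasing `offset`.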

Here are the links to my datasets:

500 Cities: Obesity among adults aged >=18 years

500 Cities: Diagnosed diabetes among adults aged >=18 years


After downloading the two datasets, I merged them, removed duplicates and NA values, and added a new column for the region.
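The cleaning steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the original code: the field names (`tract_id`, `state`, `lat`, `lon`, `obesity_pct`, `diabetes_pct`) are hypothetical, and the cut lines that split the country into four regions (latitude 39.71, longitude -98.0) are assumed values, since the report does not state how the regions were defined.

```python
def assign_region(lat, lon, lat_cut=39.71, lon_cut=-98.0):
    """Label a point NE, NW, SE, or SW relative to two cut lines (assumed values)."""
    north_south = "N" if lat >= lat_cut else "S"
    east_west = "E" if lon >= lon_cut else "W"
    return north_south + east_west


def merge_clean(obesity_rows, diabetes_rows, key="tract_id"):
    """Inner-join the two datasets on `key`, drop duplicates and NA rows, add region."""
    # Index diabetes rows by key; setdefault keeps the first copy of a duplicate.
    diabetes_by_key = {}
    for row in diabetes_rows:
        diabetes_by_key.setdefault(row[key], row)

    merged = {}
    for row in obesity_rows:
        match = diabetes_by_key.get(row[key])
        if match is None or row[key] in merged:
            continue  # no partner row, or this key was already kept
        record = {
            key: row[key],
            "state": row["state"],
            "lat": row["lat"],
            "lon": row["lon"],
            "obesity_pct": row["obesity_pct"],
            "diabetes_pct": match["diabetes_pct"],
        }
        if any(value is None for value in record.values()):
            continue  # drop rows with missing (NA) values
        record["region"] = assign_region(record["lat"], record["lon"])
        merged[row[key]] = record
    return list(merged.values())
```

Keying the merge on a tract identifier keeps one row per census tract, so each record carries both percentages and a region label.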

Results

Leaflet

In the Leaflet map, the legend shows the range of diabetes percentages, with red indicating a higher percentage of diabetes. In the map of diabetes percentage, there are more orange dots in the NE and SE regions.

Boxplots

Next, consider the boxplots. The x-axis shows the four regions: Northeast, Southeast, Northwest, and Southwest. The y-axis shows the percentage of diabetes or obesity.

In the boxplot of diabetes percentage, the NE and SE regions have a similar median diabetes percentage, and the NW region has the lowest. Overall, the eastern regions have higher median diabetes percentages than the western regions.
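The numbers a boxplot draws for each region (the quartiles and the median) can also be computed directly. The sketch below assumes each record is a dict with a `region` label and a percentage field; the function name and field names are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`, which may differ slightly from the plotting library's.

```python
from collections import defaultdict
from statistics import median, quantiles


def region_summary(records, value_key):
    """Return the quartiles and median of `value_key` for each region."""
    groups = defaultdict(list)
    for record in records:
        groups[record["region"]].append(record[value_key])

    summary = {}
    for region, values in groups.items():
        # quantiles() needs at least two values per region.
        q1, _, q3 = quantiles(values, n=4)
        summary[region] = {"q1": q1, "median": median(values), "q3": q3}
    return summary
```

Calling `region_summary(records, "diabetes_pct")` and comparing the `median` entries gives the same east-versus-west comparison that the boxplot shows visually.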


Scatter plots

For this scatter plot, I took each state’s median obesity percentage and median diabetes percentage. The plot shows a positive correlation between obesity and diabetes rates.
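The state-level aggregation behind the scatter plot, and a number that summarizes the trend, can be sketched as follows. This is an assumption-laden illustration: the field names (`state`, `obesity_pct`, `diabetes_pct`) and function names are hypothetical, and the Pearson coefficient is one common way to quantify the correlation, not necessarily the measure used for this report.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, median


def state_medians(records):
    """Map each state to (median obesity %, median diabetes %)."""
    obesity = defaultdict(list)
    diabetes = defaultdict(list)
    for record in records:
        obesity[record["state"]].append(record["obesity_pct"])
        diabetes[record["state"]].append(record["diabetes_pct"])
    return {state: (median(obesity[state]), median(diabetes[state])) for state in obesity}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def state_level_correlation(records):
    """Correlation between the state medians of obesity and diabetes."""
    pairs = list(state_medians(records).values())
    return pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
```

A value of `state_level_correlation(records)` above zero corresponds to the upward trend seen in the scatter plot; the closer it is to 1, the tighter the points lie around a rising line.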


Conclusion

Question 1: How are diabetes percentages distributed across the US?

From the Leaflet map, there are more orange dots in the NE and SE regions. From the boxplot, the median diabetes percentage is similarly high in the NE and SE regions. In addition, diabetes percentages are higher in the eastern regions than in the western regions.

Question 2: Is there any correlation between diabetes and obesity?

From the scatter plot, there is a positive correlation between obesity and diabetes rates at the state level.


Copyright © 2020, Sam Lu.